
flash-attn2: Add flash_attn_with_kvcache support for XPU#534

Open
YangKai0616 wants to merge 7 commits into huggingface:main from YangKai0616:fa-kvcache

Conversation

@YangKai0616
Contributor

Functionality checks passed; performance still needs to be tested.

@YangKai0616
Contributor Author

In transformers, PR huggingface/transformers#44379 enables the flash_attn_with_kvcache path to speed up paged-attention decode. This PR adds the same feature for XPU, which gives close to a 2x performance boost.
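
For reference, a minimal sketch of what a single decode-step call looks like, assuming the upstream flash-attn `flash_attn_with_kvcache` calling convention (q, k_cache, v_cache, new k/v, cache_seqlens, block_table); the import path, `"xpu"` device string, and tensor shapes below are illustrative, not taken from this PR:

```python
import torch
from flash_attn import flash_attn_with_kvcache  # import path assumed; the XPU build may expose it differently

batch, nheads, headdim = 4, 8, 64
page_size, num_pages, max_pages_per_seq = 256, 32, 8
device = "xpu"  # assumed device string for Intel GPUs

# Query for a single decode step: one new token per sequence.
q = torch.randn(batch, 1, nheads, headdim, device=device, dtype=torch.float16)

# Paged KV cache: (num_pages, page_size, nheads, headdim).
k_cache = torch.randn(num_pages, page_size, nheads, headdim, device=device, dtype=torch.float16)
v_cache = torch.randn_like(k_cache)

# New key/value for the current token; the kernel appends them into the cache.
k_new = torch.randn(batch, 1, nheads, headdim, device=device, dtype=torch.float16)
v_new = torch.randn_like(k_new)

# Current cache lengths per sequence, and the page table mapping sequences to cache pages.
cache_seqlens = torch.full((batch,), 130, dtype=torch.int32, device=device)
block_table = torch.arange(batch * max_pages_per_seq, dtype=torch.int32, device=device).view(
    batch, max_pages_per_seq
)

out = flash_attn_with_kvcache(
    q, k_cache, v_cache,
    k=k_new, v=v_new,
    cache_seqlens=cache_seqlens,
    block_table=block_table,
    causal=True,
)
print(out.shape)  # (batch, 1, nheads, headdim)
```

The point of this entry point is that decode reads K/V directly from the cache pages instead of rebuilding contiguous key/value tensors each step, which is where the speedup for paged decode comes from.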

@YangKai0616 YangKai0616 marked this pull request as ready for review May 9, 2026 08:38
@YangKai0616 YangKai0616 requested a review from drbh as a code owner May 9, 2026 08:38